LedPred: Learning from DNA to Predict enhancers
Authors
Abstract
2 Description
  2.1 Data
  2.2 Learning from CRM-contained information to predict new regulatory features
    2.2.1 Building the training set
    2.2.2 Optimization of support vector machine model
    2.2.3 Definition of the optimal SVM parameters (γ and C)
    2.2.4 Sorting and selecting features according to their importance in the training set description
    2.2.5 Plotting the performances of the model
  2.3 Function description
    2.3.1 From bed CRM genomic coordinates to training set matrix
    2.3.2 Scaling of the CRM feature matrix
    2.3.3 SVM parameter optimization
    2.3.4 Features ranking
    2.3.5 Selecting features
    2.3.6 Creating the best model
    2.3.7 Plotting model performance
    2.3.8 Using the model to score unknown sequences
    2.3.9 From the matrix to the model in one function
Similar resources
DELTA: A Distal Enhancer Locating Tool Based on AdaBoost Algorithm and Shape Features of Chromatin Modifications
Accurate identification of DNA regulatory elements becomes an urgent need in the post-genomic era. Recent genome-wide chromatin states mapping efforts revealed that DNA elements are associated with characteristic chromatin modification signatures, based on which several approaches have been developed to predict transcriptional enhancers. However, their practical application is limited by incomp...
Correction: LMethyR-SVM: Predict Human Enhancers Using Low Methylated Regions based on Weighted Support Vector Machines
BACKGROUND The identification of enhancers is a challenging task. Various types of epigenetic information including histone modification have been utilized in the construction of enhancer prediction models based on a diverse panel of machine learning schemes. However, DNA methylation profiles generated from the whole genome bisulfite sequencing (WGBS) have not been fully explored for their pote...
Integrating Diverse Datasets Improves Developmental Enhancer Prediction
Gene-regulatory enhancers have been identified using various approaches, including evolutionary conservation, regulatory protein binding, chromatin modifications, and DNA sequence motifs. To integrate these different approaches, we developed EnhancerFinder, a two-step method for distinguishing developmental enhancers from the genomic background and then predicting their tissue specificity. Enha...
Genome-wide enhancer prediction from epigenetic signatures using genetic algorithm-optimized support vector machines
The chemical modification of histones at specific DNA regulatory elements is linked to the activation, inactivation and poising of genes. A number of tools exist to predict enhancers from chromatin modification maps, but their practical application is limited because they either (i) consider a smaller number of marks than those necessary to define the various enhancer classes or (ii) work with ...
A synergistic DNA logic predicts genome-wide chromatin accessibility.
Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SC...